Tootfinder

@arXiv_csCR_bot@mastoxiv.page
2024-05-10 06:48:15

Deep Multi-Task Learning for Malware Image Classification
Ahmed Bensaoud, Jugal Kalita
https://arxiv.org/abs/2405.05906 https://arxiv…

Deep Multi-Task Learning for Malware Image Classification
Malicious software is a pernicious global problem. A novel multi-task learning framework is proposed in this paper for malware image classification for accurate and fast malware detection. We generate bitmap (BMP) and PNG images from malware features, which we feed to a deep learning classifier. Our state-of-the-art multi-task learning approach has been tested on a new dataset, for which we have collected approximately 100,000 benign and malicious PE, APK, Mach-o, and ELF examples. Experiment…

@arXiv_csCV_bot@mastoxiv.page
2024-05-10 08:29:45

This https://arxiv.org/abs/2404.14955 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

A Comprehensive Survey for Hyperspectral Image Classification: The Evolution from Conventional to Transformers
Hyperspectral Image Classification (HSC) is a challenging task due to the high dimensionality and complex nature of Hyperspectral (HS) data. Traditional Machine Learning approaches, while effective, face challenges in real-world data due to varying optimal feature sets, subjectivity in human-driven design, biases, and limitations. Traditional approaches encounter the curse of dimensionality, struggle with feature selection and extraction, lack spatial information consideration, exhibit limited r…

@arXiv_mathST_bot@mastoxiv.page
2024-04-11 06:59:19

Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization
Michael Kohler, Adam Krzyzak, Alisha Sänger
https://arxiv.org/abs/2404.07128

Learning of deep convolutional network image classifiers via stochastic gradient descent and over-parametrization
Image classification from independent and identically distributed random variables is considered. Image classifiers are defined which are based on a linear combination of deep convolutional networks with max-pooling layer. Here all the weights are learned by stochastic gradient descent. A general result is presented which shows that the image classifiers are able to approximate the best possible deep convolutional network. In case that the a posteriori probability satisfies a suitable hierarchi…

@arXiv_csIT_bot@mastoxiv.page
2024-05-10 07:28:41

End-to-End Generative Semantic Communication Powered by Shared Semantic Knowledge Base
Shuling Li, Yaping Sun, Jinbei Zhang, Kechao Cai, Shuguang Cui, Xiaodong Xu
https://arxiv.org/abs/2405.05738

End-to-End Generative Semantic Communication Powered by Shared Semantic Knowledge Base
Semantic communication has drawn substantial attention as a promising paradigm to achieve effective and intelligent communications. However, efficient image semantic communication encounters challenges with a lower testing compression ratio (CR) compared to the training phase. To tackle this issue, we propose an innovative semantic knowledge base (SKB)-enabled generative semantic communication system for image classification and image generation tasks. Specifically, a lightweight SKB, comprisin…

@arXiv_statML_bot@mastoxiv.page
2024-03-11 08:48:53

This https://arxiv.org/abs/2401.12924 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_sta…

Performance Analysis of Support Vector Machine (SVM) on Challenging Datasets for Forest Fire Detection
This article delves into the analysis of performance and utilization of Support Vector Machines (SVMs) for the critical task of forest fire detection using image datasets. With the increasing threat of forest fires to ecosystems and human settlements, the need for rapid and accurate detection systems is of utmost importance. SVMs, renowned for their strong classification capabilities, exhibit proficiency in recognizing patterns associated with fire within images. By training on labeled data, SV…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-11 08:35:09

This https://arxiv.org/abs/2310.05446 has been replaced.
link: https://scholar.google.com/scholar?q=a

RetSeg: Retention-based Colorectal Polyps Segmentation Network
Vision Transformers (ViTs) have revolutionized medical imaging analysis, showcasing superior efficacy compared to conventional Convolutional Neural Networks (CNNs) in vital tasks such as polyp classification, detection, and segmentation. Leveraging attention mechanisms to focus on specific image regions, ViTs exhibit contextual awareness in processing visual data, culminating in robust and precise predictions, even for intricate medical images. Moreover, the inherent self-attention mechanism in…

@arXiv_physicsfludyn_bot@mastoxiv.page
2024-05-10 07:06:09

Mapping dissolved carbon in space and time: An experimental technique for the measurement of pH and total carbon concentration in density driven convection of CO$_2$ dissolved in water
Hilmar Yngvi Birggison, Yao Xu, Marcel Moura, Eirik Grude Flekkøy, Knut Jørgen Måløy
https://arxiv.org/abs/2405.05682

Mapping dissolved carbon in space and time: An experimental technique for the measurement of pH and total carbon concentration in density driven convection of CO$_2$ dissolved in water
We present an experimental technique for determining the pH and the total carbon concentration when CO$_2$ diffuses and flows in water. The technique employs three different pH indicators, which, when combined with an image analysis technique, provide a dynamic range in pH from 4.0 to 9.5. In contrast to usual techniques in which a single pH indicator is used, the methodology presented allows not only to produce a binary classification (pH larger or smaller than a given threshold) but to acc…

@arXiv_csCV_bot@mastoxiv.page
2024-03-08 08:30:06

This https://arxiv.org/abs/2402.15784 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

IRConStyle: Image Restoration Framework Using Contrastive Learning and Style Transfer
Recently, the contrastive learning paradigm has achieved remarkable success in high-level tasks such as classification, detection, and segmentation. However, contrastive learning applied in low-level tasks, like image restoration, is limited, and its effectiveness is uncertain. This raises a question: Why does the contrastive learning paradigm not yield satisfactory results in image restoration? In this paper, we conduct in-depth analyses and propose three guidelines to address the above questi…

@arXiv_csAI_bot@mastoxiv.page
2024-04-08 08:27:49

This https://arxiv.org/abs/2309.08395 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csAI_…

Learning by Self-Explaining
Current AI research mainly treats explanations as a means for model inspection. Yet, this neglects findings from human psychology that describe the benefit of self-explanations in an agent's learning process. Motivated by this, we introduce a novel approach in the context of image classification, termed Learning by Self-Explaining (LSX). LSX utilizes aspects of self-refining AI and human-guided explanatory machine learning. The underlying idea is that a learner model, in addition to optimizing …

@arXiv_eessIV_bot@mastoxiv.page
2024-04-08 06:53:57

LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification
Judy X Yang, Jun Zhou, Jing Wang, Hui Tian, Wee Chung Liew
https://arxiv.org/abs/2404.03883

LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification
The fusion of hyperspectral and LiDAR data has been an active research topic. Existing fusion methods have ignored the high-dimensionality and redundancy challenges in hyperspectral images, even though band selection methods have been intensively studied for hyperspectral image (HSI) processing. This paper addresses this significant gap by introducing a cross-attention mechanism from the transformer architecture for the selection of HSI bands guided by LiDAR data. LiDAR provides high-resolutio…

@arXiv_astrophCO_bot@mastoxiv.page
2024-05-09 07:34:47

Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo
Nayantara Mudur, Carolina Cuesta-Lazaro, Douglas P. Finkbeiner
https://arxiv.org/abs/2405.05255

Diffusion-HMC: Parameter Inference with Diffusion Model driven Hamiltonian Monte Carlo
Diffusion generative models have excelled at diverse image generation and reconstruction tasks across fields. A less explored avenue is their application to discriminative tasks involving regression or classification problems. The cornerstone of modern cosmology is the ability to generate predictions for observed astrophysical fields from theory and constrain physical models from observations using these predictions. This work uses a single diffusion generative model to address these interlinke…

@arXiv_astrophIM_bot@mastoxiv.page
2024-05-07 06:59:58

Bayesian and Convolutional Networks for Hierarchical Morphological Classification of Galaxies
Jonathan Serrano-Pérez, Raquel Díaz Hernández, L. Enrique Sucar
https://arxiv.org/abs/2405.02366

Bayesian and Convolutional Networks for Hierarchical Morphological Classification of Galaxies
This work is focused on the morphological classification of galaxies following the Hubble sequence in which the different classes are arranged in a hierarchy. The proposed method, BCNN, is composed of two main modules. First, a convolutional neural network (CNN) is trained with images of the different classes of galaxies (image augmentation is carried out to balance some classes); the CNN outputs the probability for each class of the hierarchy, and its outputs/predictions feed the second module…

@arXiv_csCL_bot@mastoxiv.page
2024-03-07 08:24:44

This https://arxiv.org/abs/2203.11155 has been replaced.
link: https://scholar.google.com/scholar?q=a

Quantum Neural Network with Density Matrix for Question Answering and Classical Image Classification
Quantum density matrix represents all the information of the entire quantum system, and novel models of meaning employing density matrices naturally model linguistic phenomena such as hyponymy and linguistic ambiguity, among others in quantum question answering tasks. Naturally, we argue that applying the quantum density matrix into classical Question Answering (QA) tasks can show more effective performance. Specifically, we (i) design a new mechanism based on Long Short-Term Memory (LSTM) to a…

@arXiv_csHC_bot@mastoxiv.page
2024-04-30 07:24:33

How Deep Is Your Gaze? Leveraging Distance in Image-Based Gaze Analysis
Maurice Koch, Nelusa Pathmanathan, Daniel Weiskopf, Kuno Kurzhals
https://arxiv.org/abs/2404.18680

How Deep Is Your Gaze? Leveraging Distance in Image-Based Gaze Analysis
Image thumbnails are a valuable data source for fixation filtering, scanpath classification, and visualization of eye tracking data. They are typically extracted with a constant size, approximating the foveated area. As a consequence, the focused area of interest in the scene becomes less prominent in the thumbnail with increasing distance, affecting image-based analysis techniques. In this work, we propose depth-adaptive thumbnails, a method for varying image size according to the eye-to-objec…

@arXiv_csNE_bot@mastoxiv.page
2024-04-08 08:31:43

This https://arxiv.org/abs/2404.03493 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csNE_…

A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data
Autonomous Driving (AD) systems are considered as the future of human mobility and transportation. Solving computer vision tasks such as image classification and object detection/segmentation, with high accuracy and low power/energy consumption, is highly needed to realize AD systems in real life. These requirements can potentially be satisfied by Spiking Neural Networks (SNNs). However, the state-of-the-art works in SNN-based AD systems still focus on proposing network models that can achieve …

@arXiv_eessIV_bot@mastoxiv.page
2024-03-07 07:28:30

MedMamba: Vision Mamba for Medical Image Classification
Yubiao Yue, Zhenzhang Li
https://arxiv.org/abs/2403.03849 https://arxiv.org/p…

MedMamba: Vision Mamba for Medical Image Classification
Medical image classification is a very fundamental and crucial task in the field of computer vision. These years, CNN-based and Transformer-based models are widely used in classifying various medical images. Unfortunately, the limitation of CNNs in long-range modeling capabilities prevents them from effectively extracting fine-grained features in medical images, while Transformers are hampered by their quadratic computational complexity. Recent research has shown that the state space model (SSM…

@arXiv_statME_bot@mastoxiv.page
2024-03-07 07:21:39

A consensus-constrained parsimonious Gaussian mixture model for clustering hyperspectral images
Ganesh Babu, Aoife Gowen, Michael Fop, Isobel Claire Gormley
https://arxiv.org/abs/2403.03349

A consensus-constrained parsimonious Gaussian mixture model for clustering hyperspectral images
The use of hyperspectral imaging to investigate food samples has grown due to the improved performance and lower cost of spectroscopy instrumentation. Food engineers use hyperspectral images to classify the type and quality of a food sample, typically using classification methods. In order to train these methods, every pixel in each training image needs to be labelled. Typically, computationally cheap threshold-based approaches are used to label the pixels, and classification methods are traine…

@arXiv_csCE_bot@mastoxiv.page
2024-05-07 07:20:10

Development and Validation of an Artificial Neural Network for the Recognition of Custom Dataset with YOLOv4
P. Veysi, M. Adeli, N. Peirov Naziri
https://arxiv.org/abs/2405.02298 …

Development and Validation of an Artificial Neural Network for the Recognition of Custom Dataset with YOLOv4
The expanding applications, utilized by more users, enhance hardware performance and further develop cloud systems for big data processing. This leads to numerous unexplored deep learning applications, especially in advanced computer vision for object recognition. Deep learning in image processing encompasses varied tasks from recognizing elements with diverse shapes and sizes to complex element classification, coping with varying backgrounds and lighting conditions, and text recognition. Its a…

@arXiv_csCR_bot@mastoxiv.page
2024-04-08 07:22:50

Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism
Trilokesh Ranjan Sarkar, Nilanjan Das, Pralay Sankar Maitra, Bijoy Some, Ritwik Saha, Orijita Adhikary, Bishal Bose, Jaydip Sen
https://arxiv.org/abs/2404.04245 …

Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism
This technical report delves into an in-depth exploration of adversarial attacks specifically targeted at Deep Neural Networks (DNNs) utilized for image classification. The study also investigates defense mechanisms aimed at bolstering the robustness of machine learning models. The research focuses on comprehending the ramifications of two prominent attack methodologies: the Fast Gradient Sign Method (FGSM) and the Carlini-Wagner (CW) approach. These attacks are examined concerning three pre-tr…
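
To make the FGSM idea named in the abstract above concrete, here is a minimal sketch of the standard one-step update x_adv = x + eps * sign(grad_x loss). It is not the report's code: the tiny logistic-regression model, the weights `w` and `b`, the input `x`, and the step size `eps` are all illustrative assumptions, chosen so the example runs with the Python standard library alone.

```python
import math


def predict(w, b, x):
    """Probability of class 1 under a toy logistic-regression model (assumed model)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(w, b, x, y):
    """Gradient of the binary cross-entropy loss with respect to the input x.

    For logistic regression this is (p - y) * w, where p is the predicted probability.
    """
    p = predict(w, b, x)
    return [(p - y) * wi for wi in w]


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move each input coordinate by eps in the direction that increases the loss."""
    grad = input_gradient(w, b, x, y)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model parameters and input, for illustration only.
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.2], 1

x_adv = fgsm(w, b, x, y, eps=0.5)
p_clean = predict(w, b, x)
p_adv = predict(w, b, x_adv)
```

With these assumed values the clean input is scored above 0.5 and the perturbed input below it, which is the behaviour an attack of this kind aims for. A real evaluation would compute the gradient through a deep network with an autodiff framework instead of the closed form used here.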

@arXiv_csLG_bot@mastoxiv.page
2024-05-02 07:17:44

Data Augmentation Policy Search for Long-Term Forecasting
Liran Nochumsohn, Omri Azencot
https://arxiv.org/abs/2405.00319 https://arx…

Data Augmentation Policy Search for Long-Term Forecasting
Data augmentation serves as a popular regularization technique to combat overfitting challenges in neural networks. While automatic augmentation has demonstrated success in image classification tasks, its application to time-series problems, particularly in long-term forecasting, has received comparatively less attention. To address this gap, we introduce a time-series automatic augmentation approach named TSAA, which is both efficient and easy to implement. The solution involves tackling the a…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-08 06:53:53

Enhancing chest X-ray datasets with privacy-preserving large language models and multi-type annotations: a data-driven approach for improved classification
Ricardo Bigolin Lanfredi, Pritam Mukherjee, Ronald Summers
https://arxiv.org/abs/2403.04024

Enhancing chest X-ray datasets with privacy-preserving large language models and multi-type annotations: a data-driven approach for improved classification
In chest X-ray (CXR) image analysis, rule-based systems are usually employed to extract labels from reports, but concerns exist about label quality. These datasets typically offer only presence labels, sometimes with binary uncertainty indicators, which limits their usefulness. In this work, we present MAPLEZ (Medical report Annotations with Privacy-preserving Large language model using Expeditious Zero shot answers), a novel approach leveraging a locally executable Large Language Model (LLM) t…

@arXiv_csCR_bot@mastoxiv.page
2024-03-08 08:28:54

This https://arxiv.org/abs/2402.16896 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCR_…

On Trojan Signatures in Large Language Models of Code
Trojan signatures, as described by Fields et al. (2021), are noticeable differences in the distribution of the trojaned class parameters (weights) and the non-trojaned class parameters of the trojaned model, that can be used to detect the trojaned model. Fields et al. (2021) found trojan signatures in computer vision classification tasks with image models such as Resnet, WideResnet, Densenet, and VGG. In this paper, we investigate such signatures in the classifier layer parameters of large la…

@JLBe@mastodon.social
2024-02-26 15:53:08

Parkinson's is a neurodegenerative #disease that usually manifests itself through tremor. A recent #study used surface electromyography to examine the characteristics of these rhythmic movements in order to investigate early diagnosis.

The illustration on the left shows the setup of the multi-sensor signal acquisition platform. The patient wears the sEMG (surface electromyography) electrodes on both arms. The signals are transmitted to a computer via Bluetooth. In addition, video material is recorded by a camera for evaluation. The image on the right shows the exercises that the patients had to perform during the study: a) placing their hands on the back of the chair, b) extending their arms, c) pronation/supination (turning …

Hand Movement Recognition and Salient Tremor Feature Extraction With Wearable Devices in Parkinson’s Patients
Tremor is one of the earliest signs of Parkinson’s disease (PD), which seriously disrupts patients’ daily lives. It is important to study upper limb tremors quantitatively to control PD progression. In this study, surface electromyography (sEMG) signals from wearable devices are used to recognize rest, posture, and kinetic tremor action from six upper limb clinical actions and to quantify features of tremors. A multivariable time-series classification model (MTSCM) based on fully convolutio…

@arXiv_csCV_bot@mastoxiv.page
2024-03-06 08:32:23

This https://arxiv.org/abs/2403.01944 has been replaced.
link: https://scholar.google.com/scholar?q=a

Fourier-basis Functions to Bridge Augmentation Gap: Rethinking Frequency Augmentation in Image Classification
Computer vision models normally witness degraded performance when deployed in real-world scenarios, due to unexpected changes in inputs that were not accounted for during training. Data augmentation is commonly used to address this issue, as it aims to increase data variety and reduce the distribution gap between training and test data. However, common visual augmentations might not guarantee extensive robustness of computer vision models. In this paper, we propose Auxiliary Fourier-basis Augme…

@arXiv_mathNT_bot@mastoxiv.page
2024-02-27 08:30:11

This https://arxiv.org/abs/2311.07740 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_mat…

Towards a classification of isolated $j$-invariants
We develop an algorithm to test whether a non-CM elliptic curve $E/\mathbb{Q}$ gives rise to an isolated point of any degree on any modular curve of the form $X_1(N)$. This builds on prior work of Zywina which gives a method for computing the image of the adelic Galois representation associated to $E$. Running this algorithm on all elliptic curves presently in the $L$-functions and Modular Forms Database and the Stein-Watkins Database gives strong evidence for the conjecture that $E$ gives rise…

@arXiv_csAR_bot@mastoxiv.page
2024-03-01 06:46:57

EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration
Bo Liu, Grace Li Zhang, Xunzhao Yin, Ulf Schlichtmann, Bing Li
https://arxiv.org/abs/2402.18595

EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration
Deep neural networks (DNNs) have achieved great breakthroughs in many fields such as image classification and natural language processing. However, the execution of DNNs needs to conduct massive numbers of multiply-accumulate (MAC) operations on hardware and thus incurs a large power consumption. To address this challenge, we propose a novel digital MAC design based on encoding. In this new design, the multipliers are replaced by simple logic gates to project the results onto a wide bit represe…

@arXiv_eessIV_bot@mastoxiv.page
2024-05-06 07:31:55

Deep Learning Descriptor Hybridization with Feature Reduction for Accurate Cervical Cancer Colposcopy Image Classification
Saurabh Saini, Kapil Ahuja, Siddartha Chennareddy, Karthik Boddupalli
https://arxiv.org/abs/2405.01600

Deep Learning Descriptor Hybridization with Feature Reduction for Accurate Cervical Cancer Colposcopy Image Classification
Cervical cancer stands as a predominant cause of female mortality, underscoring the need for regular screenings to enable early diagnosis and preemptive treatment of pre-cancerous conditions. The transformation zone in the cervix, where cellular differentiation occurs, plays a critical role in the detection of abnormalities. Colposcopy has emerged as a pivotal tool in cervical cancer prevention since it provides a meticulous examination of cervical abnormalities. However, challenges in visual e…

@arXiv_physicsbioph_bot@mastoxiv.page
2024-03-26 07:20:36

On machine learning analysis of atomic force microscopy images for image classification, sample surface recognition
Igor Sokolov
https://arxiv.org/abs/2403.16230

On machine learning analysis of atomic force microscopy images for image classification, sample surface recognition
Atomic force microscopy (AFM or SPM) imaging is one of the best matches with machine learning (ML) analysis among microscopy techniques. The digital format of AFM images allows for direct utilization in ML algorithms without the need for additional processing. Additionally, AFM enables the simultaneous imaging of distributions of over a dozen different physicochemical properties of sample surfaces, a process known as multidimensional imaging. While this wealth of information can be challenging …

@arXiv_csHC_bot@mastoxiv.page
2024-04-16 07:18:49

Interaction as Explanation: A User Interaction-based Method for Explaining Image Classification Models
Hyeonggeun Yun
https://arxiv.org/abs/2404.09828 http…

Interaction as Explanation: A User Interaction-based Method for Explaining Image Classification Models
In computer vision, explainable AI (xAI) methods seek to mitigate the 'black-box' problem by making the decision-making process of deep learning models more interpretable and transparent. Traditional xAI methods concentrate on visualizing input features that influence model predictions, providing insights primarily suited for experts. In this work, we present an interaction-based xAI method that enhances user comprehension of image classification models through their interaction. Thus, we devel…

@arXiv_eessIV_bot@mastoxiv.page
2024-05-06 07:31:59

Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey
Guoping Xu, Xiaxia Wang, Xinglong Wu, Xuesong Leng, Yongchao Xu
https://arxiv.org/abs/2405.01725

Development of Skip Connection in Deep Neural Networks for Computer Vision and Medical Image Analysis: A Survey
Deep learning has made significant progress in computer vision, specifically in image classification, object detection, and semantic segmentation. The skip connection has played an essential role in the architecture of deep neural networks, enabling easier optimization through residual learning during the training stage and improving accuracy during testing. Many neural networks have inherited the idea of residual learning with skip connections for various tasks, and it has been the standard cho…
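
The skip connection described in the abstract above can be sketched as a residual block that computes y = x + F(x). This is a generic illustration, not code from the survey: the two-layer form of F, the helper names, and the weight matrices are assumptions, and plain Python lists stand in for tensors.

```python
def linear(weights, v):
    """Matrix-vector product; weights is a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) for row in weights]


def relu(v):
    """Element-wise rectified linear unit."""
    return [max(0.0, a) for a in v]


def residual_block(w1, w2, x):
    """Compute y = x + F(x), with F(x) = w2 @ relu(w1 @ x) as an assumed residual branch."""
    branch = linear(w2, relu(linear(w1, x)))
    return [xi + fi for xi, fi in zip(x, branch)]


identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]
x = [1.0, -2.0]

# With a zero residual branch the block reduces to the identity mapping.
y_identity = residual_block(identity, zeros, x)

# With identity weights the branch contributes relu(x), which is added to x.
y_residual = residual_block(identity, identity, x)
```

The first call shows why such blocks are easy to optimize: when the branch weights are zero, the block passes its input through unchanged, so a deeper network can start from the behaviour of a shallower one.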

@arXiv_csCV_bot@mastoxiv.page
2024-04-05 08:32:13

This https://arxiv.org/abs/2404.02388 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

CAPE: CAM as a Probabilistic Ensemble for Enhanced DNN Interpretation
Deep Neural Networks (DNNs) are widely used for visual classification tasks, but their complex computation process and black-box nature hinder decision transparency and interpretability. Class activation maps (CAMs) and recent variants provide ways to visually explain the DNN decision-making process by displaying 'attention' heatmaps of the DNNs. Nevertheless, the CAM explanation only offers relative attention information, that is, on an attention heatmap, we can interpret which image region is…

@arXiv_csNE_bot@mastoxiv.page
2024-04-05 07:15:17

A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data
Iqra Bano, Rachmad Vidya Wicaksana Putra, Alberto Marchisio, Muhammad Shafique
https://arxiv.org/abs/2404.03493

A Methodology to Study the Impact of Spiking Neural Network Parameters considering Event-Based Automotive Data
Autonomous Driving (AD) systems are considered the future of human mobility and transportation. Solving computer vision tasks such as image classification and object detection/segmentation, with high accuracy and low power/energy consumption, is highly needed to realize AD systems in real life. These requirements can potentially be satisfied by Spiking Neural Networks (SNNs). However, the state-of-the-art works in SNN-based AD systems still focus on proposing network models that can achieve …

@arXiv_eessIV_bot@mastoxiv.page
2024-02-27 06:54:20

Investigating the Robustness of Vision Transformers against Label Noise in Medical Image Classification
Bidur Khanal, Prashant Shrestha, Sanskar Amgain, Bishesh Khanal, Binod Bhattarai, Cristian A. Linte
https://arxiv.org/abs/2402.16734

Investigating the Robustness of Vision Transformers against Label Noise in Medical Image Classification
Label noise in medical image classification datasets significantly hampers the training of supervised deep learning methods, undermining their generalizability. The test performance of a model tends to decrease as the label noise rate increases. Over recent years, several methods have been proposed to mitigate the impact of label noise in medical image classification and enhance the robustness of the model. Predominantly, these works have employed CNN-based architectures as the backbone of thei…
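
The abstract above refers to a label noise rate. One common way to simulate it in experiments is symmetric noise, where each label is replaced by a different class with probability equal to the rate. The sketch below assumes that noise model; the truncated abstract does not say which model the authors use, and the function name, seed, and toy labels are hypothetical.

```python
import random


def add_symmetric_label_noise(labels, num_classes, rate, seed=0):
    """Return a copy of labels where each entry is flipped with probability `rate`.

    A flipped label is drawn uniformly from the other classes, so it always
    differs from the original (assumed symmetric noise model).
    """
    rng = random.Random(seed)  # local generator keeps the sketch reproducible
    noisy = []
    for y in labels:
        if rng.random() < rate:
            others = [c for c in range(num_classes) if c != y]
            noisy.append(rng.choice(others))
        else:
            noisy.append(y)
    return noisy


labels = [0, 1, 2, 1, 0, 2, 1, 0]
clean = add_symmetric_label_noise(labels, num_classes=3, rate=0.0)
fully_noisy = add_symmetric_label_noise(labels, num_classes=3, rate=1.0)
```

Training the same model on copies of a dataset corrupted at increasing rates is one way to observe the drop in test performance that the abstract describes.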

@arXiv_csCV_bot@mastoxiv.page
2024-04-05 08:32:10

This https://arxiv.org/abs/2404.02282 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

Smooth Deep Saliency
In this work, we investigate methods to reduce the noise in deep saliency maps coming from convolutional downsampling, with the purpose of explaining how a deep learning model detects tumors in scanned histological tissue samples. Those methods make the investigated models more interpretable for gradient-based saliency maps, computed in hidden layers. We test our approach on different models trained for image classification on ImageNet1K, and models trained for tumor detection on Camelyon16 and…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-03 08:42:13

This https://arxiv.org/abs/2403.03849 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

MedMamba: Vision Mamba for Medical Image Classification
Medical image classification is a very fundamental and crucial task in the field of computer vision. These years, CNN-based and Transformer-based models have been widely used to classify various medical images. Unfortunately, the limitation of CNNs in long-range modeling capabilities prevents them from effectively extracting features in medical images, while Transformers are hampered by their quadratic computational complexity. Recent research has shown that the state space model (SSM) represen…

@arXiv_csCL_bot@mastoxiv.page
2024-03-22 06:55:11

LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding
Masato Fujitake
https://arxiv.org/abs/2403.14252 https://

LayoutLLM: Large Language Model Instruction Tuning for Visually Rich Document Understanding
This paper proposes LayoutLLM, a more flexible document analysis method for understanding imaged documents. Visually Rich Document Understanding tasks, such as document image classification and information extraction, have gained significant attention due to their importance. Existing methods have been developed to enhance document comprehension by incorporating pre-training awareness of images, text, and layout structure. However, these methods require fine-tuning for each task and dataset, an…

@arXiv_mathST_bot@mastoxiv.page
2024-04-30 08:43:22

This https://arxiv.org/abs/2011.13602 has been replaced.
link: https://scholar.google.com/scholar?q=a

Statistical theory for image classification using deep convolutional neural networks with cross-entropy loss under the hierarchical max-pooling model
Convolutional neural networks (CNNs) trained with cross-entropy loss have proven to be extremely successful in classifying images. In recent years, much work has been done to also improve the theoretical understanding of neural networks. Nevertheless, it seems limited when these networks are trained with cross-entropy loss, mainly because of the unboundedness of the target function. In this paper, we aim to fill this gap by analyzing the rate of the excess risk of a CNN classifier trained by cr…

@arXiv_csHC_bot@mastoxiv.page
2024-02-13 14:35:38

This https://arxiv.org/abs/2311.12481 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csHC_…

Interpretability is in the eye of the beholder: Human versus artificial classification of image segments generated by humans versus XAI
The evaluation of explainable artificial intelligence is challenging, because automated and human-centred metrics of explanation quality may diverge. To clarify their relationship, we investigated whether human and artificial image classification will benefit from the same visual explanations. In three experiments, we analysed human reaction times, errors, and subjective ratings while participants classified image segments. These segments either reflected human attention (eye movements, manual …

@arXiv_csCV_bot@mastoxiv.page
2024-03-01 07:06:35

Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance
Huakun Shen, Boyue Caroline Hu, Krzysztof Czarnecki, Lina Marsso, Marsha Chechik
https://arxiv.org/abs/2402.19401

Assessing Visually-Continuous Corruption Robustness of Neural Networks Relative to Human Performance
While Neural Networks (NNs) have surpassed human accuracy in image classification on ImageNet, they often lack robustness against image corruption, i.e., corruption robustness. Yet such robustness is seemingly effortless for human perception. In this paper, we propose visually-continuous corruption robustness (VCR) -- an extension of corruption robustness to allow assessing it over the wide and continuous range of changes that correspond to the human perceptive quality (i.e., from the original …

@arXiv_csCV_bot@mastoxiv.page
2024-04-26 08:32:31

This https://arxiv.org/abs/2404.11003 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

InfoMatch: Entropy Neural Estimation for Semi-Supervised Image Classification
Semi-supervised image classification, leveraging pseudo supervision and consistency regularization, has demonstrated remarkable success. However, the ongoing challenge lies in fully exploiting the potential of unlabeled data. To address this, we employ information entropy neural estimation to harness the potential of unlabeled samples. Inspired by contrastive learning, the entropy is estimated by maximizing a lower bound on mutual information across different augmented views. Moreover, we theor…

@arXiv_eessIV_bot@mastoxiv.page
2024-05-01 06:53:52

Remote Sensing Image Enhancement through Spatiotemporal Filtering
Hessah Albanwan
https://arxiv.org/abs/2404.18950 https://arxiv.org/pdf/2404.18950
arXiv:2404.18950v1 Announce Type: new
Abstract: The analysis of time-sequence satellite images is a powerful tool in remote sensing; it is used to explore the statics and dynamics of the surface of the earth. Usually, the quality of multitemporal images is influenced by meteorological conditions, high reflectance of surfaces, illumination, and satellite sensor conditions. These negative influences may produce noise and different radiances and appearances between the images, which can affect the applications that process them. Thus, a spatiotemporal bilateral filter has been adopted in this research to enhance the quality of an image before using it in any application. The filter takes advantage of the temporal information provided by multitemporal images and attempts to reduce the differences between them to improve transfer learning used in classification. The classification method used here is support vector machine (SVM). Three experiments were conducted in this research: two on Landsat 8 images with low-medium resolution, and the third on high-resolution images of Planet satellite. The newly developed filter proved that it can enhance the accuracy of classification using transfer learning by about 5%, 15%, and 2% for the three experiments respectively.

@arXiv_csCV_bot@mastoxiv.page
2024-02-26 08:30:22

This https://arxiv.org/abs/2402.13699 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

Explainable Classification Techniques for Quantum Dot Device Measurements
In the physical sciences, there is an increased need for robust feature representations of image data: image acquisition, in the generalized sense of two-dimensional data, is now widespread across a large number of fields, including quantum information science, which we consider here. While traditional image features are widely utilized in such cases, their use is rapidly being supplanted by Neural Network-based techniques that often sacrifice explainability in exchange for high accuracy. To am…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-28 06:53:52

Integrative Graph-Transformer Framework for Histopathology Whole Slide Image Representation and Classification
Zhan Shi, Jingwei Zhang, Jun Kong, Fusheng Wang
https://arxiv.org/abs/2403.18134

Integrative Graph-Transformer Framework for Histopathology Whole Slide Image Representation and Classification
In digital pathology, the multiple instance learning (MIL) strategy is widely used in the weakly supervised histopathology whole slide image (WSI) classification task where giga-pixel WSIs are only labeled at the slide level. However, existing attention-based MIL approaches often overlook contextual information and intrinsic spatial relationships between neighboring tissue tiles, while graph-based MIL frameworks have limited power to recognize the long-range dependencies. In this paper, we intr…

@arXiv_csNE_bot@mastoxiv.page
2024-04-29 08:32:01

This https://arxiv.org/abs/2306.12465 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csNE_…

Efficient Deep Spiking Multi-Layer Perceptrons with Multiplication-Free Inference
Advancements in adapting deep convolution architectures for Spiking Neural Networks (SNNs) have significantly enhanced image classification performance and reduced computational burdens. However, the inability of Multiplication-Free Inference (MFI) to align with attention and transformer mechanisms, which are critical to superior performance on high-resolution vision tasks, imposes limitations on these gains. To address this, our research explores a new pathway, drawing inspiration from the pr…

@arXiv_csCV_bot@mastoxiv.page
2024-02-23 06:53:43

Text Role Classification in Scientific Charts Using Multimodal Transformers
Hye Jin Kim, Nicolas Lell, Ansgar Scherp
https://arxiv.org/abs/2402.14579 https…

Text Role Classification in Scientific Charts Using Multimodal Transformers
Text role classification involves classifying the semantic role of textual elements within scientific charts. For this task, we propose to finetune two pretrained multimodal document layout analysis models, LayoutLMv3 and UDOP, on chart datasets. The transformers utilize the three modalities of text, image, and layout as input. We further investigate whether data augmentation and balancing methods help the performance of the models. The models are evaluated on various chart datasets, and result…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-28 06:54:00

Deep Learning Segmentation and Classification of Red Blood Cells Using a Large Multi-Scanner Dataset
Mohamed Elmanna, Ahmed Elsafty, Yomna Ahmed, Muhammad Rushdi, Ahmed Morsy
https://arxiv.org/abs/2403.18468

Deep Learning Segmentation and Classification of Red Blood Cells Using a Large Multi-Scanner Dataset
Digital pathology has recently been revolutionized by advancements in artificial intelligence, deep learning, and high-performance computing. With its advanced tools, digital pathology can help improve and speed up the diagnostic process, reduce human errors, and streamline the reporting step. In this paper, we report a new large red blood cell (RBC) image dataset and propose a two-stage deep learning framework for RBC image segmentation and classification. The dataset is a highly diverse datas…

@arXiv_csCV_bot@mastoxiv.page
2024-03-01 07:06:27

Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction
Hao Li, Ying Chen, Yifei Chen, Wenxian Yang, Bowen Ding, Yuchen Han, Liansheng Wang, Rongshan Yu
https://arxiv.org/abs/2402.19326

@arXiv_eessIV_bot@mastoxiv.page
2024-04-18 06:53:58

Automatic classification of prostate MR series type using image content and metadata
Deepa Krishnaswamy, Bálint Kovács, Stefan Denner, Steve Pieper, David Clunie, Christopher P. Bridge, Tina Kapur, Klaus H. Maier-Hein, Andrey Fedorov
https://arxiv.org/abs/2404.10892

Automatic classification of prostate MR series type using image content and metadata
With the wealth of medical image data, efficient curation is essential. Assigning the sequence type to magnetic resonance images is necessary for scientific studies and artificial intelligence-based analysis. However, incomplete or missing metadata prevents effective automation. We therefore propose a deep-learning method for classification of prostate cancer scanning sequences based on a combination of image data and DICOM metadata. We demonstrate superior results compared to metadata or image…

@arXiv_csCR_bot@mastoxiv.page
2024-04-17 08:28:12

This https://arxiv.org/abs/2306.08538 has been replaced.
link: https://scholar.google.com/scholar?q=a

Fast and Private Inference of Deep Neural Networks by Co-designing Activation Functions
Machine Learning as a Service (MLaaS) is an increasingly popular design where a company with abundant computing resources trains a deep neural network and offers query access for tasks like image classification. The challenge with this design is that MLaaS requires the client to reveal their potentially sensitive queries to the company hosting the model. Multi-party computation (MPC) protects the client's data by allowing encrypted inferences. However, current approaches suffer from prohibitive…

@arXiv_csCV_bot@mastoxiv.page
2024-03-01 07:06:30

Stitching Gaps: Fusing Situated Perceptual Knowledge with Vision Transformers for High-Level Image Classification
Delfina Sol Martinez Pandiani, Nicolas Lazzari, Valentina Presutti
https://arxiv.org/abs/2402.19339

@arXiv_eessIV_bot@mastoxiv.page
2024-04-01 06:53:57

A multi-stage semi-supervised learning for ankle fracture classification on CT images
Hongzhi Liu, Guicheng Li, Jiacheng Nie, Hui Tang, Chunfeng Yang, Qianjin Feng, Hailin Xu, Yang Chen
https://arxiv.org/abs/2403.19983

A multi-stage semi-supervised learning for ankle fracture classification on CT images
Because of the complicated mechanism of ankle injury, it is very difficult to diagnose ankle fracture in clinical practice. In order to simplify the process of fracture diagnosis, an automatic diagnosis model of ankle fracture was proposed. Firstly, a tibia-fibula segmentation network is proposed for the joint tibiofibular region of the ankle joint, and the corresponding segmentation dataset is established on the basis of fracture data. Secondly, the image registration method is used to register the bone …

@arXiv_csCV_bot@mastoxiv.page
2024-02-13 14:32:43

This https://arxiv.org/abs/2301.04494 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

Multi-label Image Classification using Adaptive Graph Convolutional Networks: from a Single Domain to Multiple Domains
This paper proposes an adaptive graph-based approach for multi-label image classification. Graph-based methods have been largely exploited in the field of multi-label classification, given their ability to model label correlations. Specifically, their effectiveness has been proven not only when considering a single domain but also when taking into account multiple domains. However, the topology of the used graph is not optimal as it is pre-defined heuristically. In addition, consecutive Graph C…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-16 07:35:06

Breast Cancer Image Classification Method Based on Deep Transfer Learning
Weimin Wang, Min Gao, Mingxuan Xiao, Xu Yan, Yufeng Li
https://arxiv.org/abs/2404.09226

Breast Cancer Image Classification Method Based on Deep Transfer Learning
To address the issues of limited samples, time-consuming feature design, and low accuracy in detection and classification of breast cancer pathological images, a breast cancer image classification model algorithm combining deep learning and transfer learning is proposed. This algorithm is based on the DenseNet structure of deep neural networks, and constructs a network model by introducing attention mechanisms, and trains the enhanced dataset using multi-level transfer learning. Experimental re…

@arXiv_csCV_bot@mastoxiv.page
2024-03-19 07:26:59

Distilling Datasets Into Less Than One Image
Asaf Shul, Eliahu Horwitz, Yedid Hoshen
https://arxiv.org/abs/2403.12040 https://arxiv.o…

Distilling Datasets Into Less Than One Image
Dataset distillation aims to compress a dataset into a much smaller one so that a model trained on the distilled dataset achieves high accuracy. Current methods frame this as maximizing the distilled classification accuracy for a budget of K distilled images-per-class, where K is a positive integer. In this paper, we push the boundaries of dataset distillation, compressing the dataset into less than an image-per-class. It is important to realize that the meaningful quantity is not the number of…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-01 08:37:15

This https://arxiv.org/abs/2402.17187 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

PE-MVCNet: Multi-view and Cross-modal Fusion Network for Pulmonary Embolism Prediction
The early detection of a pulmonary embolism (PE) is critical for enhancing patient survival rates. Both image-based and non-image-based features are of utmost importance in medical classification tasks. In a clinical setting, physicians tend to rely on the contextual information provided by Electronic Medical Records (EMR) to interpret medical imaging. However, very few models effectively integrate clinical information with imaging data. To address this shortcoming, we suggest a multimodal fusi…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-24 08:33:56

This https://arxiv.org/abs/2308.04956 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

Improved cryo-EM Pose Estimation and 3D Classification through Latent-Space Disentanglement
Due to the extremely low signal-to-noise ratio (SNR) and unknown poses (projection angles and image shifts) in cryo-electron microscopy (cryo-EM) experiments, reconstructing 3D volumes from 2D images is very challenging. In addition to these challenges, heterogeneous cryo-EM reconstruction requires conformational classification. In popular cryo-EM reconstruction algorithms, poses and conformation classification labels must be predicted for every input cryo-EM image, which can be computationally…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-30 07:34:02

SPLICE -- Streamlining Digital Pathology Image Processing
Areej Alsaafin, Peyman Nejat, Abubakr Shafique, Jibran Khan, Saghir Alfasly, Ghazal Alabtah, H. R. Tizhoosh
https://arxiv.org/abs/2404.17704 https://arxiv.org/pdf/2404.17704
arXiv:2404.17704v1 Announce Type: new
Abstract: Digital pathology and the integration of artificial intelligence (AI) models have revolutionized histopathology, opening new opportunities. With the increasing availability of Whole Slide Images (WSIs), there's a growing demand for efficient retrieval, processing, and analysis of relevant images from vast biomedical archives. However, processing WSIs presents challenges due to their large size and content complexity. Full computer digestion of WSIs is impractical, and processing all patches individually is prohibitively expensive. In this paper, we propose an unsupervised patching algorithm, Sequential Patching Lattice for Image Classification and Enquiry (SPLICE). This novel approach condenses a histopathology WSI into a compact set of representative patches, forming a "collage" of WSI while minimizing redundancy. SPLICE prioritizes patch quality and uniqueness by sequentially analyzing a WSI and selecting non-redundant representative features. We evaluated SPLICE for search and match applications, demonstrating improved accuracy, reduced computation time, and storage requirements compared to existing state-of-the-art methods. As an unsupervised method, SPLICE effectively reduces storage requirements for representing tissue images by 50%. This reduction enables numerous algorithms in computational pathology to operate much more efficiently, paving the way for accelerated adoption of digital pathology.

@arXiv_eessIV_bot@mastoxiv.page
2024-04-23 08:44:34

This https://arxiv.org/abs/2308.04956 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

Improved cryo-EM Pose Estimation and 3D Classification through Latent-Space Disentanglement
Due to the extremely low signal-to-noise ratio (SNR) and unknown poses (projection angles and image shifts) in cryo-electron microscopy (cryo-EM) experiments, reconstructing 3D volumes from 2D images is very challenging. In addition to these challenges, heterogeneous cryo-EM reconstruction requires conformational classification. In popular cryo-EM reconstruction algorithms, poses and conformation classification labels must be predicted for every input cryo-EM image, which can be computationally…

@arXiv_csCV_bot@mastoxiv.page
2024-02-12 08:30:50

This https://arxiv.org/abs/2304.02621 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

High-fidelity Pseudo-labels for Boosting Weakly-Supervised Segmentation
Image-level weakly-supervised semantic segmentation (WSSS) reduces the usually vast data annotation cost by surrogate segmentation masks during training. The typical approach involves training an image classification network using global average pooling (GAP) on convolutional feature maps. This enables the estimation of object locations based on class activation maps (CAMs), which identify the importance of image regions. The CAMs are then used to generate pseudo-labels, in the form of segmenta…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-13 06:53:54

A slice classification neural network for automated classification of axial PET/CT slices from a multi-centric lymphoma dataset
Shadab Ahamed, Yixi Xu, Ingrid Bloise, Joo H. O, Carlos F. Uribe, Rahul Dodhia, Juan L. Ferres, Arman Rahmim
https://arxiv.org/abs/2403.07105

A slice classification neural network for automated classification of axial PET/CT slices from a multi-centric lymphoma dataset
Automated slice classification is clinically relevant since it can be incorporated into medical image segmentation workflows as a preprocessing step that would flag slices with a higher probability of containing tumors, thereby directing physicians' attention to the important slices. In this work, we train a ResNet-18 network to classify axial slices of lymphoma PET/CT images (collected from two institutions) depending on whether the slice intercepted a tumor (positive slice) in the 3D image or …

@arXiv_eessIV_bot@mastoxiv.page
2024-05-03 08:48:51

This https://arxiv.org/abs/2308.01381 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

Estimation of motion blur kernel parameters using regression convolutional neural networks
Many deblurring and blur kernel estimation methods use a maximum a posteriori (MAP) approach or deep learning-based classification techniques to sharpen an image and/or predict the blur kernel. We propose a regression approach using convolutional neural networks (CNNs) to predict parameters of linear motion blur kernels, the length and orientation of the blur. We analyze the relationship between length and angle of linear motion blur that can be represented as digital filter kernels. A large da…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-30 07:34:07

Spatial, Temporal, and Geometric Fusion for Remote Sensing Images
Hessah Albanwan
https://arxiv.org/abs/2404.17851 https://arxiv.org/pdf/2404.17851
arXiv:2404.17851v1 Announce Type: new
Abstract: Remote sensing (RS) images are important to monitor and survey earth at varying spatial scales. Continuous observations from various RS sources complement single observations to improve applications. Fusion into single or multiple images provides more informative, accurate, complete, and coherent data. Studies intensively investigated spatial-temporal fusion for specific applications like pan-sharpening and spatial-temporal fusion for time-series analysis. Fusion methods can process different images, modalities, and tasks and are expected to be robust and adaptive to various types of images (e.g., spectral images, classification maps, and elevation maps) and scene complexities. This work presents solutions to improve existing fusion methods that process gridded data and consider their type-specific uncertainties. The contributions include: 1) A spatial-temporal filter that addresses spectral heterogeneity of multitemporal images. 2) 3D iterative spatiotemporal filter that enhances spatiotemporal inconsistencies of classification maps. 3) Adaptive semantic-guided fusion that enhances the accuracy of DSMs and compares them with traditional fusion approaches to show the significance of adaptive methods. 4) A comprehensive analysis of DL stereo matching methods against traditional Census-SGM to obtain detailed knowledge on the accuracy of the DSMs at the stereo matching level. We analyze the overall performance, robustness, and generalization capability, which helps identify the limitations of current DSM generation methods. 5) Based on previous analysis, we develop a novel finetuning strategy to enhance transferability of DL stereo matching methods, hence, the accuracy of DSMs. Our work shows the importance of spatial, temporal, and geometric fusion in enhancing RS applications. It shows that the fusion problem is case-specific and depends on the image type, scene content, and application.

@arXiv_eessIV_bot@mastoxiv.page
2024-03-20 06:53:53

Generalizing deep learning models for medical image classification
Matta Sarah, Lamard Mathieu, Zhang Philippe, Alexandre Le Guilcher, Laurent Borderie, Béatrice Cochener, Gwenolé Quellec
https://arxiv.org/abs/2403.12167

Generalizing deep learning models for medical image classification
Numerous Deep Learning (DL) models have been developed for a large spectrum of medical image analysis applications, which promises to reshape various facets of medical practice. Despite early advances in DL model validation and implementation, which encourage healthcare institutions to adopt them, some fundamental questions remain: are the DL models capable of generalizing? What causes a drop in DL model performances? How to overcome the DL model performance drop? Medical data are dynamic and p…

@arXiv_csCV_bot@mastoxiv.page
2024-02-14 08:29:31

This https://arxiv.org/abs/2402.06198 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

GS-CLIP: Gaussian Splatting for Contrastive Language-Image-3D Pretraining from Real-World Data
3D shapes represented as point clouds have achieved advancements in multimodal pre-training to align image and language descriptions, which is crucial to object identification, classification, and retrieval. However, the discrete representation of point clouds loses the object's surface shape information and creates a gap between rendering results and 2D correspondences. To address this problem, we propose GS-CLIP for the first attempt to introduce 3DGS (3D Gaussian Splatting) into multimodal pre-tra…

@arXiv_eessIV_bot@mastoxiv.page
2024-02-28 06:53:59

PE-MVCNet: Multi-view and Cross-modal Fusion Network for Pulmonary Embolism Prediction
Zhaoxin Guo, Zhipeng Wang, Ruiquan Ge, Jianxun Yu, Feiwei Qin, Yuan Tian, Yuqing Peng, Yonghong Li, Changmiao Wang
https://arxiv.org/abs/2402.17187

PE-MVCNet: Multi-view and Cross-modal Fusion Network for Pulmonary Embolism Prediction
The early detection of a pulmonary embolism (PE) is critical for enhancing patient survival rates. Both image-based and non-image-based features are of utmost importance in medical classification tasks. In a clinical setting, physicians tend to rely on the contextual information provided by Electronic Medical Records (EMR) to interpret medical imaging. However, very few models effectively integrate clinical information with imaging data. To address this shortcoming, we suggest a multimodal fusi…

@arXiv_csCV_bot@mastoxiv.page
2024-04-12 08:31:24

This https://arxiv.org/abs/2404.06859 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

Multi-Label Continual Learning for the Medical Domain: A Novel Benchmark
Multi-label image classification in dynamic environments is a problem that poses significant challenges. Previous studies have primarily focused on scenarios such as Domain Incremental Learning and Class Incremental Learning, which do not fully capture the complexity of real-world applications. In this paper, we study the problem of classification of medical imaging in the scenario termed New Instances & New Classes, which combines the challenges of both new class arrivals and domain shifts in…

@arXiv_csCV_bot@mastoxiv.page
2024-02-13 14:33:25

This https://arxiv.org/abs/2310.06085 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

Quantile-based Maximum Likelihood Training for Outlier Detection
Discriminative learning effectively predicts true object class for image classification. However, it often results in false positives for outliers, posing critical concerns in applications like autonomous driving and video surveillance systems. Previous attempts to address this challenge involved training image classifiers through contrastive learning using actual outlier data or synthesizing outliers for self-supervised learning. Furthermore, unsupervised generative modeling of inliers in pixe…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-22 06:53:51

Pneumonia Diagnosis through pixels -- A Deep Learning Model for detection and classification
Amit Karanth Gurpur, Janani S, Ajeetha B, Brintha Therese A, Rajeswaran Rangasami
https://arxiv.org/abs/2404.12405

Pneumonia Diagnosis through pixels -- A Deep Learning Model for detection and classification
Manual identification and classification of pneumonia and COVID-19 infection is a cumbersome process that, if delayed, can cause irreversible damage to the patient. We have compiled CT scan images from various sources, namely, from the China Consortium of Chest CT Image Investigation (CC-CCII), the Negin Radiology located at Sari in Iran, an open access COVID-19 repository from Harvard Dataverse, and Sri Ramachandra University, Chennai, India. The images were preprocessed using various methods su…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-22 08:37:56

This https://arxiv.org/abs/2403.03849 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

MedMamba: Vision Mamba for Medical Image Classification
Medical image classification is a very fundamental and crucial task in the field of computer vision. In recent years, CNN-based and Transformer-based models have been widely used to classify various medical images. Unfortunately, the limitation of CNNs in long-range modeling capabilities prevents them from effectively extracting features in medical images, while Transformers are hampered by their quadratic computational complexity. Recent research has shown that the state space model (SSM) represen…

@arXiv_csCV_bot@mastoxiv.page
2024-02-12 08:30:45

This https://arxiv.org/abs/2302.05262 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

Evaluation of Data Augmentation and Loss Functions in Semantic Image Segmentation for Drilling Tool Wear Detection
Tool wear monitoring is crucial for quality control and cost reduction in manufacturing processes, of which drilling applications are one example. In this paper, we present a U-Net based semantic image segmentation pipeline, deployed on microscopy images of cutting inserts, for the purpose of wear detection. The wear area is differentiated in two different types, resulting in a multiclass classification problem. Joining the two wear types in one general wear class, on the other hand, allows the…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-12 07:35:47

Dynamic Perturbation-Adaptive Adversarial Training on Medical Image Classification
Shuai Li, Xiaoguang Ma, Shancheng Jiang, Lu Meng
https://arxiv.org/abs/2403.06798

Dynamic Perturbation-Adaptive Adversarial Training on Medical Image Classification
Remarkable successes were made in Medical Image Classification (MIC) recently, mainly due to wide applications of convolutional neural networks (CNNs). However, adversarial examples (AEs) exhibited imperceptible similarity with raw data, raising serious concerns on network robustness. Although adversarial training (AT), in responding to malevolent AEs, was recognized as an effective approach to improve robustness, it was challenging to overcome generalization decline of networks caused by the A…

@arXiv_eessIV_bot@mastoxiv.page
2024-02-27 06:54:05

Integrating Preprocessing Methods and Convolutional Neural Networks for Effective Tumor Detection in Medical Imaging
Ha Anh Vu
https://arxiv.org/abs/2402.16221

Integrating Preprocessing Methods and Convolutional Neural Networks for Effective Tumor Detection in Medical Imaging
This research presents a machine-learning approach for tumor detection in medical images using convolutional neural networks (CNNs). The study focuses on preprocessing techniques to enhance image features relevant to tumor detection, followed by developing and training a CNN model for accurate classification. Various image processing techniques, including Gaussian smoothing, bilateral filtering, and K-means clustering, are employed to preprocess the input images and highlight tumor regions. The…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-21 06:53:59

SIFT-DBT: Self-supervised Initialization and Fine-Tuning for Imbalanced Digital Breast Tomosynthesis Image Classification
Yuexi Du, Regina J. Hooley, John Lewin, Nicha C. Dvornek
https://arxiv.org/abs/2403.13148

SIFT-DBT: Self-supervised Initialization and Fine-Tuning for Imbalanced Digital Breast Tomosynthesis Image Classification
Digital Breast Tomosynthesis (DBT) is a widely used medical imaging modality for breast cancer screening and diagnosis, offering higher spatial resolution and greater detail through its 3D-like breast volume imaging capability. However, the increased data volume also introduces pronounced data imbalance challenges, where only a small fraction of the volume contains suspicious tissue. This further exacerbates the data imbalance due to the case-level distribution in real-world data and leads to l…

@arXiv_csCV_bot@mastoxiv.page
2024-02-16 08:31:05

This https://arxiv.org/abs/2402.01188 has been replaced.
link: https://scholar.google.com/scholar?q=a

Segment Any Change
Visual foundation models have achieved remarkable results in zero-shot image classification and segmentation, but zero-shot change detection remains an open problem. In this paper, we propose the segment any change models (AnyChange), a new type of change detection model that supports zero-shot prediction and generalization on unseen change types and data distributions. AnyChange is built on the segment anything model (SAM) via our training-free adaptation method, bitemporal latent matching. By…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-15 06:53:59

Randomized Principal Component Analysis for Hyperspectral Image Classification
Mustafa Ustuner
https://arxiv.org/abs/2403.09117 https://

Randomized Principal Component Analysis for Hyperspectral Image Classification
The high-dimensional feature space of the hyperspectral imagery poses major challenges to the processing and analysis of the hyperspectral data sets. In such a case, dimensionality reduction is necessary to decrease the computational complexity. The random projections open up new ways of dimensionality reduction, especially for large data sets. In this paper, the principal component analysis (PCA) and randomized principal component analysis (R-PCA) for the classification of hyperspectral images…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-12 06:53:58

Learning to Classify New Foods Incrementally Via Compressed Exemplars
Justin Yang, Zhihao Duan, Jiangpeng He, Fengqing Zhu
https://arxiv.org/abs/2404.07507

Learning to Classify New Foods Incrementally Via Compressed Exemplars
Food image classification systems play a crucial role in health monitoring and diet tracking through image-based dietary assessment techniques. However, existing food recognition systems rely on static datasets characterized by a pre-defined fixed number of food classes. This contrasts drastically with the reality of food consumption, which features constantly changing data. Therefore, food image classification systems should adapt to and manage data that continuously evolves. This is where con…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-12 07:35:46

Shortcut Learning in Medical Image Segmentation
Manxi Lin, Nina Weng, Kamil Mikolaj, Zahra Bashir, Morten Bo Søndergaard Svendsen, Martin Tolsgaard, Anders Nymark Christensen, Aasa Feragen
https://arxiv.org/abs/2403.06748

Shortcut Learning in Medical Image Segmentation
Shortcut learning is a phenomenon where machine learning models prioritize learning simple, potentially misleading cues from data that do not generalize well beyond the training set. While existing research primarily investigates this in the realm of image classification, this study extends the exploration of shortcut learning into medical image segmentation. We demonstrate that clinical annotations such as calipers, and the combination of zero-padded convolutions and center-cropped training se…

@arXiv_csCV_bot@mastoxiv.page
2024-02-13 14:32:46

This https://arxiv.org/abs/2302.02108 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_csCV_…

Knowledge Distillation in Vision Transformers: A Critical Review
In Natural Language Processing (NLP), Transformers have already revolutionized the field by utilizing an attention-based encoder-decoder model. Recently, some pioneering works have employed Transformer-like architectures in Computer Vision (CV) and they have reported outstanding performance of these architectures in tasks such as image classification, object detection, and semantic segmentation. Vision Transformers (ViTs) have demonstrated impressive performance improvements over Convolutional …

@arXiv_eessIV_bot@mastoxiv.page
2024-02-21 08:33:34

This https://arxiv.org/abs/2303.05789 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

AnoMalNet: Outlier Detection based Malaria Cell Image Classification Method Leveraging Deep Autoencoder
Class imbalance is a pervasive issue in the field of disease classification from medical images. It is necessary to balance out the class distribution while training a model for decent results. However, in the case of rare medical diseases, images from affected patients are much harder to come by compared to images from non-affected patients, resulting in unwanted class imbalance. Various processes of tackling class imbalance issues have been explored so far, each having its fair share of drawb…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-30 07:34:40

Self-supervised learning for classifying paranasal anomalies in the maxillary sinus
Debayan Bhattacharya, Finn Behrendt, Benjamin Tobias Becker, Lennart Maack, Dirk Beyersdorff, Elina Petersen, Marvin Petersen, Bastian Cheng, Dennis Eggert, Christian Betz, Anna Sophie Hoffmann, Alexander Schlaefer
https://arxiv.org/abs/2404.18599 https://arxiv.org/pdf/2404.18599
arXiv:2404.18599v1 Announce Type: new
Abstract: Purpose: Paranasal anomalies, frequently identified in routine radiological screenings, exhibit diverse morphological characteristics. Due to the diversity of anomalies, supervised learning methods require large labelled dataset exhibiting diverse anomaly morphology. Self-supervised learning (SSL) can be used to learn representations from unlabelled data. However, there are no SSL methods designed for the downstream task of classifying paranasal anomalies in the maxillary sinus (MS).
Methods: Our approach uses a 3D Convolutional Autoencoder (CAE) trained in an unsupervised anomaly detection (UAD) framework. Initially, we train the 3D CAE to reduce reconstruction errors when reconstructing normal maxillary sinus (MS) image. Then, this CAE is applied to an unlabelled dataset to generate coarse anomaly locations by creating residual MS images. Following this, a 3D Convolutional Neural Network (CNN) reconstructs these residual images, which forms our SSL task. Lastly, we fine-tune the encoder part of the 3D CNN on a labelled dataset of normal and anomalous MS images.
Results: The proposed SSL technique exhibits superior performance compared to existing generic self-supervised methods, especially in scenarios with limited annotated data. When trained on just 10% of the annotated dataset, our method achieves an Area Under the Precision-Recall Curve (AUPRC) of 0.79 for the downstream classification task. This performance surpasses other methods, with BYOL attaining an AUPRC of 0.75, SimSiam at 0.74, SimCLR at 0.73 and Masked Autoencoding using SparK at 0.75.
Conclusion: A self-supervised learning approach that inherently focuses on localizing paranasal anomalies proves to be advantageous, particularly when the subsequent task involves differentiating normal from anomalous maxillary sinuses. Access our code at https://github.com/mtec-tuhh/self-supervised-paranasal-anomaly

@arXiv_eessIV_bot@mastoxiv.page
2024-03-22 08:38:10

This https://arxiv.org/abs/2403.12167 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

Generalizing deep learning models for medical image classification
Numerous Deep Learning (DL) models have been developed for a large spectrum of medical image analysis applications, which promises to reshape various facets of medical practice. Despite early advances in DL model validation and implementation, which encourage healthcare institutions to adopt them, some fundamental questions remain: are the DL models capable of generalizing? What causes a drop in DL model performances? How to overcome the DL model performance drop? Medical data are dynamic and p…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-15 07:35:18

Convolutional neural network classification of cancer cytopathology images: taking breast cancer as an example
MingXuan Xiao, Yufeng Li, Xu Yan, Min Gao, Weimin Wang
https://arxiv.org/abs/2404.08279

Convolutional neural network classification of cancer cytopathology images: taking breast cancer as an example
Breast cancer is a relatively common cancer among gynecological cancers. Its diagnosis often relies on the pathology of cells in the lesion. The pathological diagnosis of breast cancer not only requires professionals and time, but also sometimes involves subjective judgment. To address the challenges of dependence on pathologists' expertise and the time-consuming nature of achieving accurate breast pathological image classification, this paper introduces an approach utilizing convolutional neura…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-28 08:32:44

This https://arxiv.org/abs/2308.13356 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

CEIMVEN: An Approach of Cutting Edge Implementation of Modified Versions of EfficientNet (V1-V2) Architecture for Breast Cancer Detection and Classification from Ultrasound Images
Undoubtedly breast cancer identifies itself as one of the most widespread and terrifying cancers across the globe. Millions of women are getting affected each year from it. Breast cancer remains the major one for being the reason of largest number of demise of women. In the recent time of research, Medical Image Computing and Processing has been playing a significant role for detecting and classifying breast cancers from ultrasound images and mammograms, along with the celestial touch of deep n…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-18 08:38:16

This https://arxiv.org/abs/2402.17187 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

PE-MVCNet: Multi-view and Cross-modal Fusion Network for Pulmonary Embolism Prediction
The early detection of a pulmonary embolism (PE) is critical for enhancing patient survival rates. Both image-based and non-image-based features are of utmost importance in medical classification tasks. In a clinical setting, physicians tend to rely on the contextual information provided by Electronic Medical Records (EMR) to interpret medical imaging. However, very few models effectively integrate clinical information with imaging data. To address this shortcoming, we suggest a multimodal fusi…

@arXiv_eessIV_bot@mastoxiv.page
2024-04-16 09:00:08

This https://arxiv.org/abs/2404.03883 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

LiDAR-Guided Cross-Attention Fusion for Hyperspectral Band Selection and Image Classification
The fusion of hyperspectral and LiDAR data has been an active research topic. Existing fusion methods have ignored the high-dimensionality and redundancy challenges in hyperspectral images, despite that band selection methods have been intensively studied for hyperspectral image (HSI) processing. This paper addresses this significant gap by introducing a cross-attention mechanism from the transformer architecture for the selection of HSI bands guided by LiDAR data. LiDAR provides high-resolutio…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-26 08:50:41

This https://arxiv.org/abs/2308.13356 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

CEIMVEN: An Approach of Cutting Edge Implementation of Modified Versions of EfficientNet (V1-V2) Architecture for Breast Cancer Detection and Classification from Ultrasound Images
Undoubtedly breast cancer identifies itself as one of the most widespread and terrifying cancers across the globe. Millions of women are getting affected each year from it. Breast cancer remains the major one for being the reason of largest number of demise of women. In the recent time of research, Medical Image Computing and Processing has been playing a significant role for detecting and classifying breast cancers from ultrasound images and mammograms, along with the celestial touch of deep n…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-26 08:51:27

This https://arxiv.org/abs/2310.09457 has been replaced.
initial toot: https://mastoxiv.page/@arXiv_ees…

UCM-Net: A Lightweight and Efficient Solution for Skin Lesion Segmentation using MLP and CNN
Skin cancer is a significant public health problem, and computer-aided diagnosis can help to prevent and treat it. A crucial step for computer-aided diagnosis is accurately segmenting skin lesions in images, which allows for lesion detection, classification, and analysis. However, this task is challenging due to the diverse characteristics of lesions, such as appearance, shape, size, color, texture, and location, as well as image quality issues like noise, artifacts, and occlusions. Deep learni…

@arXiv_eessIV_bot@mastoxiv.page
2024-03-26 07:29:49

Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer
Dominik Müller, Philip Meyer, Lukas Rentschler, Robin Manz, Daniel Hieber, Jonas Bäcker, Samantha Cramer, Christoph Wengenmayr, Bruno Märkl, Ralf Huss, Frank Kramer, Iñaki Soto-Rey, Johannes Raffler
https://arxiv.or…

Assessing the Performance of Deep Learning for Automated Gleason Grading in Prostate Cancer
Prostate cancer is a dominant health concern calling for advanced diagnostic tools. Utilizing digital pathology and artificial intelligence, this study explores the potential of 11 deep neural network architectures for automated Gleason grading in prostate carcinoma focusing on comparing traditional and recent architectures. A standardized image classification pipeline, based on the AUCMEDI framework, facilitated robust evaluation using an in-house dataset consisting of 34,264 annotated tissue …

@arXiv_eessIV_bot@mastoxiv.page
2024-03-26 07:29:47

DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks
Dominik Müller, Philip Meyer, Lukas Rentschler, Robin Manz, Jonas Bäcker, Samantha Cramer, Christoph Wengenmayr, Bruno Märkl, Ralf Huss, Iñaki Soto-Rey, Johannes Raffler
https://arxiv.org/abs/2403.16678…

DeepGleason: a System for Automated Gleason Grading of Prostate Cancer using Deep Neural Networks
Advances in digital pathology and artificial intelligence (AI) offer promising opportunities for clinical decision support and enhancing diagnostic workflows. Previous studies already demonstrated AI's potential for automated Gleason grading, but lack state-of-the-art methodology and model reusability. To address this issue, we propose DeepGleason: an open-source deep neural network based image classification system for automated Gleason grading using whole-slide histopathology images from pros…

@arXiv_eessIV_bot@mastoxiv.page
2024-02-13 12:56:12

Comparative Analysis of ImageNet Pre-Trained Deep Learning Models and DINOv2 in Medical Imaging Classification
Yuning Huang, Jingchen Zou, Lanxi Meng, Xin Yue, Qing Zhao, Jianqiang Li, Changwei Song, Gabriel Jimenez, Shaowu Li, Guanghui Fu
https://arxiv.org/abs/2402.07595

Comparative Analysis of ImageNet Pre-Trained Deep Learning Models and DINOv2 in Medical Imaging Classification
Medical image analysis frequently encounters data scarcity challenges. Transfer learning has been effective in addressing this issue while conserving computational resources. The recent advent of foundational models like the DINOv2, which uses the vision transformer architecture, has opened new opportunities in the field and gathered significant interest. However, DINOv2's performance on clinical data still needs to be verified. In this paper, we performed a glioma grading task using three clin…
